Abstract We have employed NMR to investigate the structure
of the SARS coronavirus nucleocapsid protein dimer. We found that
the secondary structure of the dimerization domain consists of
five α-helices and a β-hairpin. The dimer interface consists of a
continuous four-stranded β-sheet superposed by two long α-helices,
reminiscent of that found in the nucleocapsid protein of porcine
reproductive and respiratory syndrome virus. Extensive
hydrogen bond formation between the two hairpins and hydrophobic
interactions between the β-sheet and the α-helices render
the interface highly stable. Sequence alignment suggests that
other coronaviruses may share the same structural topology.

© 2005 Published by Elsevier B.V. on behalf of the Federation of 
European Biochemical Societies. 
1. Introduction

Severe acute respiratory syndrome (SARS) is the first emerging
infectious disease of the 21st century, with a fatality rate of
ca. 8%, and is caused by a novel coronavirus (CoV) [1]. The
nucleocapsid protein is a key component of the virus and is
essential for virus formation. It binds to the viral RNA to form
a ribonucleoprotein core, which can enter the host cell and
interact with cellular processes [2-5]. The free protein presumably
exists as a dimer in solution, with the dimerization domain
located at the C-terminus [6,7]. We have previously
defined the structural domains of the SARS-CoV N protein 
[8]. The C-terminal structural domain coincides with the 
dimerization domain identified in previous studies, and our 
biochemical studies showed that it exists as a dimer in solution. 
Denaturation studies have shown that dissociation of the 
SARS-CoV N protein is coupled with loss of structure, implying
a structure-dependent mechanism for self-association [9].
Understanding this mechanism would not only provide insights
into the viral assembly process, but also identify additional
targets for drugs to combat SARS through disruption
of N protein self-association. However, no 3D structure of the
dimerization domain of a coronavirus N protein has been
published, and the underlying principle governing the
self-association of the coronavirus N protein dimer is also
unknown. We have employed nuclear magnetic resonance
(NMR) techniques to investigate the structure of the dimerization
domain of SARS-CoV N protein. We report our results in
this communication.


2. Materials and methods 

2.1. Plasmid construction 

SARS-CoV TW1 strain cDNA clones were kindly provided to us by
Dr. P.-J. Chen of National Taiwan University Hospital [10]. Clones
encoding a SARS-CoV nucleocapsid protein fragment consisting of residues
281-365 (NP281-365) and a SARS-CoV nucleocapsid protein fragment
consisting of residues 248-365 (NP248-365) were obtained by
polymerase chain reaction (PCR) on a RoboCycler Gradient 96
(Stratagene, CA, USA) using appropriate primers. Each resulting PCR
fragment contained an NcoI site at one end and a BamHI site at the other.
After restriction enzyme digestion, the fragment was cloned
into the pET6H plasmid, which contains a His-tag coding region.
The resultant protein fragment included an extra MHHHHHHAMG
sequence at the N-terminus.

2.2. Protein expression and purification 

The fragments corresponding to residues 248-365 (NP248-365) and
281-365 (NP281-365) of SARS-CoV N protein were expressed in
Escherichia coli BL21(DE3) strain. Isotopically labeled protein samples
were prepared by culturing the cells in standard M9 media, supplemented
with 15NH4Cl (1 g/L) (for 15N-labeling) and/or u-13C-glucose
(2 g/L) (for 13C-labeling) and appropriately labeled Isogro (0.5 g/L)
(Isotec, OH, USA). Perdeuterated isotopically labeled protein samples
were prepared by culturing the cells in the same media in D2O (80%
D2O for samples used in filtered experiments) and supplemented with
deuterated Isogro and glucose. Deuteration levels for all clones were on
the order of 85% (65% for samples used in filtered experiments) as
measured by mass spectrometry. The cells were broken with a
microfluidizer and the protein was purified through a Ni-NTA affinity column
(Qiagen, CA, USA) in buffer (50 mM sodium phosphate, 150 mM
NaCl, pH 7.4) containing 7 M urea. The protein was then allowed
to refold by dialysis in liquid chromatography buffer (50 mM sodium
phosphate, 150 mM NaCl, 1 mM EDTA, 0.01% NaN3, pH 7.4).

Fig. 1. (A) 15N-HSQC spectrum of u-15N-NP248-365. (B) Summary of the NMR parameters employed for secondary structure prediction. Dots at the
top indicate residues whose NH protons are protected from deuterium exchange after 24 h. (C) Secondary structure profile of the SARS-CoV N protein.
The two shaded areas represent the N-terminal and C-terminal structural domains. Secondary structure of the N-terminal domain was adapted from
Huang et al. [27].

Renatured protein was loaded onto an AKTA-EXPLORER fast performance
liquid chromatography (FPLC) system equipped with a
HiLoad 16/60 Superdex 75 column (Amersham Pharmacia Biotech,
Sweden). Complete Protease Inhibitor cocktail (Roche, Germany)
was added to the purified protein. Protein concentration was determined
with the Bio-Rad Protein Assay kit as per instructions from
the manufacturer (Bio-Rad, CA, USA). The correct molecular weight
of the expressed protein was then confirmed by mass spectrometry.

2.3. NMR spectroscopy 

Protein samples for NMR experiments contained between 0.5 and
3 mM protein in NMR buffer (10 mM sodium phosphate buffer, pH
6.0, containing 50 mM NaCl, 1 mM EDTA, 1 mM
2,2-dimethyl-2-silapentane-5-sulfonate (DSS), 0.01% NaN3, 10% D2O and complete
protease inhibitor cocktail). All NMR data were acquired at 30 °C on 500,
600 or 800 MHz Bruker AVANCE spectrometers equipped with a
triple resonance TXI cryoprobe with an actively-shielded Z-gradient.
Experimental parameters were set as described previously [11,12].
Sequential backbone resonance assignments for 1HN, 15N, 13Cα and
13Cβ were derived from standard 3D HNCA, HN(CO)CA, HNCO,
HN(CA)CO, CBCANH, and CBCA(CO)NH experiments [13].
(Hm)CmCH-TOCSY experiments were also obtained to correct for the
isotope effect on 13C chemical shifts [14]. The assignments of Hα and Hβ
resonances were achieved from analysis of the HBHA(CO)NH
spectrum. H(CC)(CO)NH, CC(CO)NH and HCCH-TOCSY spectra were
analyzed to obtain side chain assignments. To identify the interface
region involved in dimer interactions, F1-[13C,15N]-filtered, F3-15N-edited
and F1-[13C,15N]-filtered, F3-13C-edited 3D NOESY-HSQC spectra
were obtained using a u-(2H,13C,15N)-NP248-365 (65% deuteration)/
NP248-365 hetero-dimer sample prepared by mixing the labeled NP248-365
sample with an equal amount of unlabeled NP248-365 sample [15,16].
The protein was denatured in 8 M urea and renatured by extensive
dialysis in the desired NMR buffer conditions. Data were processed with
the XWINNMR suite and AURELIA software (Bruker, Germany)
on SGI workstations or NMRPipe on Linux workstations [17]. The
1H chemical shift was referenced to DSS at 0 ppm as suggested [18].
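
The 1H referencing to DSS described above is commonly extended to the heteronuclei by indirect referencing. The following is a minimal Python sketch of that arithmetic, not the processing procedure used in this work. It assumes the frequency ratios (Xi) usually tabulated for DSS-based referencing of 13C and 15N, and the 1H zero-ppm frequency in the example is a hypothetical value.

```python
# Frequency ratios (Xi) relative to the 1H signal of DSS at 0 ppm.
# These are the commonly tabulated values for aqueous samples; treat them
# as assumptions and confirm them against reference [18] before use.
XI = {
    "13C": 0.251449530,  # 13C zero point (DSS scale)
    "15N": 0.101329118,  # 15N zero point (liquid NH3 scale)
}


def zero_ppm_frequency(h1_zero_mhz: float, nucleus: str) -> float:
    """Absolute frequency (MHz) that corresponds to 0 ppm for a heteronucleus."""
    return h1_zero_mhz * XI[nucleus]


def ppm(observed_mhz: float, zero_mhz: float) -> float:
    """Chemical shift (ppm) of a signal observed at observed_mhz."""
    return (observed_mhz - zero_mhz) / zero_mhz * 1e6


if __name__ == "__main__":
    h1_zero = 600.130000  # hypothetical 1H frequency of DSS at 0 ppm (MHz)
    for nucleus in ("13C", "15N"):
        print(nucleus, f"{zero_ppm_frequency(h1_zero, nucleus):.6f} MHz")
```

Keeping the ratios in one table means that a single measured 1H zero point fixes the scales of all three nuclei, which is the point of indirect referencing.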


2.4. Static light scattering 

Protein samples in NMR buffer were diluted to a concentration between
0.5 and 2 mg/ml. Prior to loading into the cuvette, samples were
filtered through a 0.22 µm-cutoff filter. Data were acquired on a
DynaPro MS/X light scattering system equipped with a fixed-angle detector
(Protein Solutions, NJ, USA) at 4 °C. Analysis was carried out with the
Dynamics V6 program suite included in the system on an IBM
PC-compatible computer. Concentration effects were corrected within
the program.

Fig. 2. (A) Strip plots showing the intermolecular NOE connectivities in the β-sheet (left panel) and between the side chain resonances of residues in
the β2 strand of one monomer (labeled on top of the strips) and side chain resonances of residues in helix E of the other monomer (indicated by
arrows) (right panel). The strips in the left panel were selected from the F1-[13C,15N]-filtered, F3-15N-edited 3D NOESY-HSQC spectrum and the
strips in the right panel were selected from the F1-[13C,15N]-filtered, F3-13C-edited 3D NOESY-HSQC spectrum. (B) NOE connectivities of the β-sheet
forming the dimer interface. The shaded arrows and the connecting loops represent the two β-hairpins of the two monomers. The two-headed arrows
show the observed NOE pairs and the dotted lines are the proposed hydrogen bonds stabilizing the β-hairpins, as well as the dimer interface between
the two β-hairpins. The dotted rectangular boxes represent the positions of the two helices which interact with the four-stranded β-sheet. The boxed
residues are those involved in hydrophobic interaction with the helices. The NOEs between β1 and β2 (also β1' and β2' in the other monomer) were
obtained from the 3D 15N-NOESY-HSQC spectrum of the u-15N-NP248-365 sample and the interfacial NOEs between β2 and β2' were obtained from the
15N-filtered 3D NOESY-HSQC spectrum of the sample containing the u-(2H,13C,15N)-NP248-365 (65% deuteration)/unlabeled NP248-365 hetero-dimer.

2.5. Chemical cross-linking 

The homobifunctional amine cross-linker disuccinimidyl suberate
was purchased from Sigma-Aldrich (MO, USA) and prepared in
N,N-dimethylformamide (DMF) to a concentration of 25 mg/ml.
Reactions were carried out with a final protein concentration of
0.35 mM and a final cross-linker concentration of 5 mM. Mock
reactions containing only the protein solution and DMF without
cross-linker were set up as controls. The reaction mixtures in NMR buffer
were allowed to react for 1 h at 4 °C prior to quenching with 100 mM
glycine. The results were visualized on SDS-PhastGel minigels
(Pharmacia Biotech, Sweden).
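
The concentrations quoted for the cross-linking reaction can be checked with a short calculation. The sketch below is illustrative only: the molar mass of disuccinimidyl suberate is computed from the formula C16H20N2O8, which is an assumption to be confirmed against the supplier's data sheet, and the 100 µl reaction volume is hypothetical.

```python
# Standard atomic masses (g/mol) for the elements in the cross-linker.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Disuccinimidyl suberate, assumed formula C16H20N2O8.
SUBERATE_FORMULA = {"C": 16, "H": 20, "N": 2, "O": 8}


def molar_mass(formula: dict) -> float:
    """Molar mass (g/mol) from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def stock_molarity_mm(mg_per_ml: float, g_per_mol: float) -> float:
    """Convert mg/ml (numerically g/L) to mM."""
    return mg_per_ml / g_per_mol * 1000.0


def stock_volume_ul(final_mm: float, stock_mm: float, reaction_ul: float) -> float:
    """Volume of stock needed to reach final_mm in a reaction of reaction_ul."""
    return final_mm / stock_mm * reaction_ul


if __name__ == "__main__":
    stock = stock_molarity_mm(25.0, molar_mass(SUBERATE_FORMULA))
    print(f"stock concentration: {stock:.1f} mM")
    print(f"stock per 100 ul reaction: {stock_volume_ul(5.0, stock, 100.0):.1f} ul")
    print(f"cross-linker : protein molar ratio: {5.0 / 0.35:.1f}")
```

Under these assumptions the 25 mg/ml stock is roughly 68 mM, and the cross-linker is present at about a 14-fold molar excess over the protein.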


3. Results and discussion 

3.1. Secondary structure of the dimerization domain 
We have previously shown that SARS-CoV N protein consists
of two structured domains, the RNA binding domain
(a.a. 45-181) and the dimerization domain (a.a. 248-365), with
the remainder of the sequence existing in a disordered state [8].
NP248-365 is the most stable domain that retains the dimer
structure. Shortening the fragment causes structural changes,
whereas lengthening the fragment has no effect on the structure
of NP248-365. Backbone assignment was achieved for most amino acids of
NP248-365, except for residues located at the N-terminus
and H301 (Fig. 1A). Perdeuteration of NP248-365
was necessary to obtain triple-resonance spectra due to the short
T2 (transverse relaxation time) of the dimer. The secondary
structure of NP248-365 was determined from standard NMR
parameters, such as the characteristic NOE patterns, the consensus
chemical shift indices (CSI), the magnitude of the 3JHNα
value, and the amide proton exchange rates (Fig. 1B). The results show
that NP248-365 consists of five α-helices (A, Val271-Phe274;
B, Gln290-Gln295; C, Trp302-Phe308; D, Ala312-Gly317
and E, Phe347-Ala360) and two β-strands (β1, Arg320-Thr326;
β2, Gly329-Leu340) (see Fig. 1C).
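
The chemical shift index used above can be illustrated with a simplified, single-nucleus sketch. It assumes that Cα secondary shifts (observed minus random-coil values) are already available, uses a ±0.7 ppm threshold and minimum run lengths of four (helix) and three (strand) as conventional but assumed settings, and runs on hypothetical numbers rather than data from this study. The consensus CSI combines several nuclei and is more tolerant of isolated outliers than this version.

```python
from itertools import groupby


def csi_index(delta_ca: float, threshold: float = 0.7) -> int:
    """+1 (helix-like), -1 (strand-like) or 0 for one C-alpha secondary shift."""
    if delta_ca > threshold:
        return 1
    if delta_ca < -threshold:
        return -1
    return 0


def segments(deltas, first_residue, min_helix=4, min_strand=3):
    """Return (type, start, end) for uninterrupted runs of identical indices."""
    found = []
    position = first_residue
    for index, run in groupby(csi_index(d) for d in deltas):
        length = len(list(run))
        if index == 1 and length >= min_helix:
            found.append(("H", position, position + length - 1))
        elif index == -1 and length >= min_strand:
            found.append(("E", position, position + length - 1))
        position += length
    return found


if __name__ == "__main__":
    # Hypothetical secondary shifts (ppm), not measurements from this work.
    demo = [0.1, 2.1, 2.5, 1.9, 2.8, 0.2, -1.2, -1.5, -0.9, -1.1, 0.3]
    print(segments(demo, first_residue=1))
```

In practice such index-based segments are cross-checked against NOE patterns, coupling constants and exchange protection, as done in Fig. 1B.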

3.2. The dimer interface is composed of a β-sheet stabilized by
helix E

The short T2 of the NP248-365 dimer (MW = 28 kD) precluded
full assignment of side chain resonances due to weak
signals, even on cryoprobe-equipped spectrometers, and thus hampered
complete 3D structure determination of the dimer to
high resolution. However, most Hα, Hβ and aliphatic side-chain
nuclei could be assigned from (Hm)CmCH-TOCSY,
HCCH-TOCSY and 13C-edited NOESY spectra. Further analysis
of the intramolecular dimer-interface NOEs identified a
number of contacts between the two β-strands, which allowed
us to define the β-hairpin structure (Fig. 2A). This information
allowed us to manually assign the intermolecular NOEs at the
dimer interface from analysis of the F1-[13C,15N]-filtered,
F3-15N-edited and F1-[13C,15N]-filtered, F3-13C-edited 3D
NOESY-HSQC spectra using the u-(2H,13C,15N)-NP248-365 (65%
deuteration)/unlabeled NP248-365 sample. Our results indicate
that the dimer interface is composed of a continuous four-stranded
β-sheet, formed by extensive hydrogen bond interactions
between the two long β-strands of the two β-hairpins,
one contributed by each of the two monomers (Figs. 2B
and 3A). The dimer is further stabilized by hydrophobic interactions
between residues on one side of the amphipathic long
helix E and the β-sheet (Fig. 3B). The presence of the extensive
interactions between the two monomers in a dimer provides
a strong stabilizing force and explains why the two monomers
cannot be separated without denaturing the protein, even
though there is no cysteine in the dimerization domain to form
a covalent disulfide bond. This arrangement is reminiscent of
the dimer interface of the porcine reproductive and respiratory
syndrome virus (PRRSV) nucleocapsid protein (Fig. 3C) [19],
the coat protein of bacteriophage MS2 (Fig. 3D) [19] and the
peptide recognition domain of the human histocompatibility
antigen (HLA) [20,21].

Fig. 3. (A) Schematic representation of the structure of the dimer interface of SARS-CoV N protein. The relative orientation between the antiparallel
β-sheet and helix E is defined by the six NOEs identified as shown in Fig. 2A. Residues involved in these NOEs are shown in ball-and-stick
representation. (B) Helical wheel plot of helix E, showing the amphipathic nature of the helix. The hydrophobic face is defined by the four
hydrophobic residues (colored green). (C) Ribbon representation of the structure of the dimer interface of the C-terminal domain of the nucleocapsid
protein of porcine reproductive and respiratory syndrome virus (PRRSV) (PDB ID: 1P65). (D) Ribbon representation of the structure of the dimer
interface of the capsid protein of bacteriophage MS2 (PDB ID: 1AQ3). The ribbon representations were prepared with the MOLMOL program.
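
The geometry behind a helical wheel such as the one in Fig. 3B can be sketched in a few lines. The example assumes an ideal α-helix with 100° of rotation per residue and uses only the residue numbering of helix E (Phe347-Ala360); the face centre of 0° and the 40° half-width are arbitrary illustrative choices, so the output shows which positions share a face on an ideal wheel, not which residues are hydrophobic in the protein.

```python
def wheel_angle(residue: int, first_residue: int = 347, step_deg: float = 100.0) -> float:
    """Angular position (degrees) of a residue on an ideal alpha-helical wheel."""
    return ((residue - first_residue) * step_deg) % 360.0


def same_face(first: int, last: int, centre_deg: float = 0.0, half_width_deg: float = 40.0):
    """Residue numbers whose wheel angle lies within half_width_deg of centre_deg."""
    face = []
    for residue in range(first, last + 1):
        # Signed angular distance from the face centre, mapped to [-180, 180).
        offset = (wheel_angle(residue, first) - centre_deg + 180.0) % 360.0 - 180.0
        if abs(offset) <= half_width_deg:
            face.append(residue)
    return face


if __name__ == "__main__":
    for residue in range(347, 361):
        print(residue, wheel_angle(residue))
    print("positions sharing the face of residue 347:", same_face(347, 360))
```

Positions spaced by three or four residues fall on the same side of the wheel, which is why hydrophobic residues with that spacing produce an amphipathic helix.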

3.3. The stable dimer interface is an ideal common building block 
for dimer interfaces 

It has been postulated that although coronaviruses are evolutionarily
related to arteriviruses, the large size discrepancy of
their nucleocapsid proteins most likely implies that they have
different folds [22]. However, we show that there are common
principles that underlie the architecture of the nucleocapsid
protein in both SARS-CoV and PRRSV. They both contain two
regions, one for RNA-binding and the other for dimerization
or oligomerization [8,23], although the two domains in the
SARS-CoV N protein are linked by a much longer flexible linker of
~120 a.a. Most importantly, the structures of the dimer
interfaces of the two viruses are very similar. The presence of
extensive interactions in the dimer interface may render this
dimer interface self-contained, i.e., it is less likely to
be dependent on the global structure of the protein. In fact, we
have found that NP281-365 still retained dimerization
potential, even though structural differences were evident from
NMR spectra (Fig. 4). Similar observations have been reported
in the literature [6]. It is evident that the tight dimer interface
structure permits a certain degree of perturbation in the global
structure without affecting the dimer structure. Such characteristics
make this particular fold ideal as a common building
block for dimer interfaces in a variety of proteins.

[Fig. 4A, light scattering results]
Sample      Radius (nm)   MW (kD)   %Mass
N248-365    2.6           32        100.0
N281-365    2.4           26        100.0

[Fig. 4B, gel: lanes 1-4, molecular weight markers at 30 kD and 14 kD]

Fig. 4. (A) Light scattering results of NP248-365 and NP281-365. Estimated particle radii and molecular weights are listed. (B) Chemical cross-linking
of NP248-365 (lanes 1 and 2) and NP281-365 (lanes 3 and 4). Lanes 1 and 3: without cross-linker. Lanes 2 and 4: with cross-linker. (C) 15N-edited HSQC
spectra of NP248-365 (left) and NP281-365 (right).

Fig. 5. Alignment of the amino acid sequences of various coronavirus N proteins. The alignment shows only the regions corresponding to the dimer
interface region of SARS-CoV. From top to bottom: SARS-CoV, porcine transmissible gastroenteritis virus (TGEV), feline coronavirus (FCoV),
human coronavirus strain 229E (HCoV 229E), bovine coronavirus (BCoV), human coronavirus strain OC43 (HCoV OC43), porcine
hemagglutinating encephalomyelitis virus (PHEV), murine hepatitis virus 1 (MHV-1) and avian infectious bronchitis virus (IBV). JPred secondary
structure predictions of the sequences are shown below the sequences. E and H represent the predicted secondary structure of a particular amino acid
as β-strand or α-helix, respectively.
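
The light scattering masses in Fig. 4A (32 kD for NP248-365 and 26 kD for NP281-365) can be compared with a rough monomer mass to gauge the oligomeric state. The sketch below assumes an average residue mass of 0.110 kD, which is a generic approximation and not a value from this work, and counts the ten-residue MHHHHHHAMG tag from the construct description.

```python
AVERAGE_RESIDUE_MASS_KD = 0.110  # rough average mass per amino acid (assumption)
TAG_LENGTH = 10                  # MHHHHHHAMG, from the construct description


def estimated_monomer_mass_kd(first: int, last: int, tag_length: int = TAG_LENGTH) -> float:
    """Approximate monomer mass (kD) from the residue range plus the tag."""
    residues = last - first + 1 + tag_length
    return residues * AVERAGE_RESIDUE_MASS_KD


def apparent_subunits(measured_kd: float, monomer_kd: float) -> float:
    """Ratio of the measured particle mass to the estimated monomer mass."""
    return measured_kd / monomer_kd


if __name__ == "__main__":
    # Measured masses are the light scattering values listed in Fig. 4A.
    for name, (first, last), measured in (
        ("NP248-365", (248, 365), 32.0),
        ("NP281-365", (281, 365), 26.0),
    ):
        monomer = estimated_monomer_mass_kd(first, last)
        ratio = apparent_subunits(measured, monomer)
        print(f"{name}: monomer ~{monomer:.1f} kD, measured/monomer = {ratio:.2f}")
```

Under these assumptions the estimated NP248-365 monomer mass of about 14 kD agrees with the 28 kD dimer mass quoted in the text, and both measured-to-monomer ratios fall between 2 and 2.5, which is more consistent with a dimer than with a monomer. Masses derived from hydrodynamic radii depend on particle shape, so the ratio is only a coarse indicator.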

3.4. The dimerization mechanism may be common among 
coronavirus nucleocapsid proteins 
To investigate whether all coronavirus N proteins share this
dimerization mechanism, we used ClustalX to align the
sequences of other coronaviruses [24]; the resulting sub-sequences
were then submitted to the JPred server for secondary
structure prediction [25]. Sequence alignment coupled with
secondary structure prediction shows that many share the ββα
topology observed in SARS-CoV (Fig. 5). In particular, the
long β-strand and the long C-terminal helix are predicted to
be present in all cases. Most of them also contain the short
β-strand, with the exceptions of BCoV, HCoV and PHEV.
These results raise the possibility that all coronaviruses employ
the same interface mechanism for dimerization and belong
to the same structural class; however, this cannot be verified
by the class-dependent prediction algorithm because of
the lack of known tertiary structure [26].
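
The topology comparison described above can be sketched as a scan over per-residue secondary structure strings of the kind shown in Fig. 5, where E marks a predicted β-strand and H a predicted α-helix. This is a minimal illustration: the input strings are hypothetical, the minimum segment length of three is an assumed filter, and no call to the JPred server is made.

```python
from itertools import groupby


def ss_segments(prediction: str, min_len: int = 3):
    """Collapse a per-residue string ('E', 'H', '-') into (type, length) runs."""
    runs = [(state, len(list(group))) for state, group in groupby(prediction)]
    return [(state, n) for state, n in runs if state in ("E", "H") and n >= min_len]


def topology(prediction: str, min_len: int = 3) -> str:
    """Order of strand (E) and helix (H) segments, e.g. 'EEH' for beta-beta-alpha."""
    return "".join(state for state, _ in ss_segments(prediction, min_len))


def has_beta_beta_alpha(prediction: str, min_len: int = 3) -> bool:
    """True if two strands are followed directly by a helix."""
    return "EEH" in topology(prediction, min_len)


if __name__ == "__main__":
    # Hypothetical prediction strings, not JPred output from this study.
    examples = {
        "with short strand": "--EEEEEE--EEEEEEEEEEE------HHHHHHHHHHHHHH-",
        "without short strand": "----------EEEEEEEEEEE------HHHHHHHHHHHHHH-",
    }
    for label, prediction in examples.items():
        print(label, topology(prediction), has_beta_beta_alpha(prediction))
```

A sequence that lacks the short strand, as reported for BCoV, HCoV and PHEV, would give an 'EH' pattern in this scheme while still retaining the long strand and the C-terminal helix.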

In conclusion, we have determined the secondary structure of
the dimerization domain of SARS-CoV N protein and have
mapped out the residues involved in the interface. We show that
the interface of the SARS-CoV N protein dimer is a four-stranded
β-sheet, superposed by two long helices. The topology closely
resembles that of the PRRSV nucleocapsid protein and the coat
protein of bacteriophage MS2. This type of dimer interface is
highly stable and could serve as one of the common building
blocks for dimer interfaces in nature. Sequence alignment and
secondary structure prediction suggest that other coronavirus
N proteins also adopt a similar dimerization mechanism.